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Abstract 

Monte-Carlo simulations are routinely used for es- 
timating the scaling exponents of complex systems. 
However, due to finite-size effects, determining the 
exponent values is often difficult and not reliable. 
Here we present a novel technique of dealing with 
the problem of finite-size scaling. This new method 
allows not only to decrease the uncertainties of the 
scaling exponents, but makes it also possible to 
determine the exponents of the asymptotic correc- 
tions to the scaling laws. The efficiency of the tech- 
nique is demonstrated by finding the scaling expo- 
nent of uncorrelated percolation cluster hulls. 

1 Introduction 

Determining the scaling exponents from the finite- 
size simulation data is a very common task in the 
physics of complex systems. In particular, this 
technique is widely used in the context of phase 
transitions, surface roughening, turbulence, granu- 
lar media, etc, c.f. reviews [IJ[SJ[3]. Typically, such 
finite-size Monte-Carlo studies involve extrapola- 
tion of the simulation data towards infinity. Unless 
there is some theoretical understanding about the 
functional form of the finite-size corrections to the 
asymptotic scaling laws of the particular system, 
such an extrapolation carries a risk of underesti- 
mating the uncertainties. In some cases, it may be 
helpful to increase the computation time and sys- 
tem size, and optimize the simulation scheme (c.f. 
[4]). However, this is not always feasible, because 
the convergence to the asymptotic scaling law may 
be very slow, c.f. |5j. Additional difficulties arise, 
when one needs to determine the exponents of the 



finite-size correction terms (c.f. 0), or when the 
asymptotic power law includes a logarithmic pre- 
factor. 

In what follows, we describe a novel technique for 
determining scaling exponents from the finite-size 
simulation data. First, we describe in which form 
the scaling law is expected to hold, and review the 
traditional method. Then, we introduce the basic 
idea which allows us to improve qualitatively the 
precision of the finite-size Monte-Carlo studies, the 
idea of studying simultaneously multiple physical 
quantities that asymptotically scale with the same 
exponent, but have different finite-size correction 
terms. After that, we describe the novel method to 
analyze Monte-Carlo simulation data for extracting 
the scaling exponents and the finite-size correction 
terms. Finally, we provide an example application 
of the technique and find the scaling exponent of 
the uncorrelated percolation cluster hulls. A com- 
parison is offered with the naive application of fit- 
ting to the asymptotic scaling law without consid- 
ering the finite-size correction terms. 

2 The asymptotic scaling law 

Let us consider a system (possibly idealized, mod- 
eling a real one), which is characterized by its size 
x, assuming that the smallest possible value of x 
plays the role of the unit length. 

Further, suppose that the mathematical expec- 
tation of a certain physical quantity scales as 



(L(x)) 



x » 1; 



(1) 



*The final publication is available at www.epj.org 



here, the angular braces denote averaging over the 
full ensemble of the model systems. The Monte- 
Carlo simulations can be used to estimate the val- 
ues of the mathematical expectation (TTJ) for several 
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system sizes x% < X2 < • • • < x n , denoted as 

Ci = L (xi), i = 1 ...n, (2) 

and the variances of them as of; the bar over a 
symbol denotes averaging over a set of Monte-Carlo 
simulations. Then, a least-square fit can be used to 
obtain the scaling exponent a, c.f. [2|. However, 
it is often difficult to estimate the uncertainty of 
the obtained result, because the magnitude of the 
finite-size corrections A within 

(L (x)) = Ax a + A (x) (3) 

is unknown. Of course, one can plot hiCi versus 
lnxi and determine such a crossover point i = k 
that for i > k, the data points lay within their 
statistical uncertainties on a straight line. Then, 
only the data points with i > k will be used for 
finding the exponent a. However, one can easily 
underestimate the adequate value of k, because the 
statistical fluctuations just happen to compensate 
the finite-size corrections A. On the other hand, 
taking excessively large values of k would inflate the 
variance of the outcome. Finally, in some cases, the 
decay rate of the corrections A can be very slow, 
so that the method outlined above will fail at the 
first step — there is no linear range of the graph. 

To resolve these problems we are going to make 
a series of assumptions. Later we will see that the 
method we develop here also validates these as- 
sumptions as it is applied and so the assumptions 
don't have to be tested externally. 

First, we assume a more complex scaling law 
for the mathematical expectation of the physical 
quantity L, in the form 

00 

(L(x)) = J2A k x a ", (4) 

k=l 

assuming that the most significant (in the sense of 
contributing to the C{) members of the sum come 
first. The greatest of the exponents is the a we 
are looking for. We separate to first members and 
rewrite the sum as 

m 

(L(x)) = J2 A ^ ak + A (*)- ( 5 ) 

k=l 

This form for the finite-size correction terms has 
been used previously, c.f. [TJ. 



Second, we assume that the contribution of A 
to Ci is smaller than their statistical fluctuation. 

Now we can apply the least-squares fit to search 
for the 2m parameters, A k and a k , k — l...m. 
However, there are a few problems. Unless we 
have some underlying idea about the parameters, 
the least-squares search is complicated — to of the 
parameters are non-linear and the search space is 
huge with many local minima. We need at least 
n > 2m + 1 data points, all at different system sizes 
— increasing computational complexity. Also, we 
can't be sure the assumptions we have made so far 
are actually correct (aside from the chi-square test 
that is designed to test data probability rather than 
the model). 

3 Different physical quantities 

Our method is designed to resolve these problems; 
it will work, if the following third condition is sat- 
isfied. 

Third, we assume that it is possible to find more 
than one physical quantity with similar scaling be- 
havior. So, we assume that instead of having just 
one quantity, we can define to distinct (linearly in- 
dependent in the finite scale) quantities, the math- 
ematical expectations (Lj) (J = 1 . . .to) of which 
asymptotically scale using the same exponent a, 
but also have the same exponents a k (k = 1 . . . m, 
so we have the same number of exponents as phys- 
ical quantities) for the finite-size correction terms: 

m 

(L, (x)) = A jk x a * + Aj (x) , j = l... to. 

k=l 

(6) 

We denote £y = Lj (xi) with corresponding covari- 
ances S^; = Cov Cu); these covariances can 
be easily calculated during the Monte- Carlo simu- 
lations. For each system size we then have a covari- 
ance matrix Xj = (Eifc;) fc; , i = l...n; with corre- 
sponding inverse matrices Wj = S,^ 1 = {wikl)kv 
A least-squares fit can now be done by minimizing 

n m / rn \ / rn 

i=l j,k=l \ 1=1 /V (=1 

(7) 

which at minimum is of chi-square distribution with 
nm — to 2 — to degrees of freedom. We have re- 
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duced the necessary calculation complexity as we 
now only need n > to + 2 different system sizes. 
Further, the distinct physical quantities that scale 
using the same exponents can be calculated from 
the same system instance within the Monte-Carlo 
simulations. 

The minimization problem is still non-linear in 
to parameters and now with total of to 2 + to pa- 
rameters. We found it yields well to the Leven- 
berg-Marquardt algorithm, given proper initial val- 
ues. However, with inadequate initial values, it can 
still lead to inconsistent results and local minima. 

It is trivial that more data should yield a better 
result. The third assumption shows how to get this 
data and how it is done at no extra computational 
cost. Next we look into how to consistently apply 
this "free" data to yield better results. 

4 Description of the method 

To simplify the problem we rewrite eq. © in ma- 
trix form, with L = ((Lj (x))) , A = (Aj k ) , X = 
(x°"°) , A = (Aj (x)), and derive 

L = AX + A, 

X = A^L-A^A = BL + 5, 1 1 

where B = A -1 and S = — A _1 A. A single row 
from this equation is 

m 

x a " = B kj (L 3 (x)) +6 k , k = 1 . . . to. (9) 
j=i 

We remark here that as Aj are small, so are the S k - 
We now attempt to find the parameters B k j by 
treating this as a least-squares fitting problem. For 
this, we construct a function 



( x i Ej=l CjCij 



(10) 



The weighting factor s 2 is simply the variance of 
the expression within the parentheses: 

(m \ m 

3=1 J fe/=l 

(11) 

We minimize the function S (d) in relation to the 
parameters C\, . . . , C m . Aside from the weighting 



factor s 2 , that depends on the values C k , this is 
a simple linear-least-squares problem. We found 
that by initially setting C k to 1 and iteratively run- 
ning the linear-least-squares algorithm, then near 
the minima of S (d) the function value converges in 
three or four iterations. 

Considering the assumptions made, it is clear 
that near d — a k the function S (d) should have a 
minimum. Conversely, if the function S (d) has ex- 
actly to clear minima, our assumptions about the 
scaling law must be correct and values of are 
exactly where S (d) has minima. Hence, we have 
found a way to extract the values ctk from the func- 
tion S(d). 

For statistical testing, the vectors 
{Cii, . . . , Cim) , i — 1 • • • n must be of multi- 
variate normal distribution. Satisfying this, at 
minima the function S (d) is of chi-square dis- 
tribution with n — m — 1 degrees of freedom. 
Consequently, just as with (J7]), we must have 
n > to + 2. To accept the exponents «fc as 
significant, a chi-square test must be performed: 
at minima the function S (d) has to satisfy the 
relation 



S (a k ) < Xn- 



(P) 



(12) 



where x\ Q f (p) ^ s ^ ne quantile at p of the chi-square 
distribution with n — m — 1 degrees of freedom 
(dof.). 

Aside from the exponents a^, we can also find 
their uncertainties Aa^ from 

S(a k ±Aa k )=S(a k ) + xl(p), (13) 

here we are making use of the constant chi-square 
boundary as the confidence limit — Aa k is deter- 
mined by the width of the dip at the minimum of 
S(d), at level S(a fe )+X? (p). 

In case we are uncertain about the results, we 
can always revert back to ([7]) . We found that when 
doing so, the parameters derived using the above 
described novel method perform flawlessly as initial 
values for this non-linear minimization problem and 
results yielded by the classical but complex are 
the same. 

Compared to (|7|). where we have a nonlinear 
multidimensional minimization problem, the novel 
method contains a linear one-dimensional search. 
This gives us consistent results as we don't have 
to deal with local minima. Furthermore, each of 
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Figure 1: Square bond percolation lattice. Bonds 
(bold solid line segments) are randomly placed into 
the lattice. Clusters are formed by bonds that are 
connected to each other. The largest cluster in the 
center is illustrated with its hull (the zig-zag line) 
and the unscreened perimeter (the dotted line). 

the correction exponents is statistically tested sep- 
arately, instead of one big sum in ([7]l - we have 
found that this excludes invalid results that would 
otherwise pass. 

5 Example application 

As an example of the techniques described, we 
calculate the scaling exponent of the hull of the 
uncorrelated percolation cluster. The percolation 
problem deals with the structures that form by 
randomly placing elementary geometrical objects 
(spheres, sticks, sites, bonds, etc.) either freely into 
continuum, or into a fixed lattice (fig. [TJ. Two ob- 
jects are said to communicate, if their distance is 
less than some given Ao, and communicating ob- 
jects form bigger structures called clusters. Perco- 
lation theory studies the formation of clusters and 
their properties. The more interesting aspect is 
when and how does an infinite cluster form. This 
depends on the lattice site occupation probabil- 
ity. The minimum site occupation probability when 
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Figure 2: Some of the different physical quantities 
that scale with the same exponent as the hull. 

an infinite cluster appears is called the percolation 
threshold. Near this probability, the percolation 
model displays critical behavior and long-range cor- 
relations. For the square bond percolation model 
we use here, this critical probability is p = 0.5. 

Percolation theory is used to study and model a 
wide variety of phenomena, for example fluid flow 
in a porous medium [H], thermal phase transitions 
and critical behavior in magnetism with dilute Ising 
models [5]. 

Several structures can be identified in conjunc- 
tion with a percolation cluster. For example, the 
cluster itself, the hull and the unscreened perimeter 
(fig. [T|) . Aside from these, many others are known 
such as the oceanic coastline [5], the backbone or 
the chemical (shortest) distance. Near the perco- 
lation threshold, all of these structures are fractals 
and can be characterized by scaling exponents. 

In this example, we concentrate on the scaling ex- 
ponent of the hull of uncorrelated percolation clus- 
ters. The exact value of this scaling exponent is 
known, d H = 1.75 [TO]; c.f. HQ. 

First, we identify the different physical quantities 
(from here on, the properties of the hull) that scale 
together with the hull. They are (see fig. 0) 

• bonds - the number of distinct bonds the hull 
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touches, 

• segs - the number of segments in the hull zig- 
zag, 

• ends - the number of distinct bonds touched 
by the hull that have no connections on one 
end, 

• sides - the number of distinct bonds that are 
touched by the hull from both sides, 

• lines - the number of occurrences of four 
straight segments in the hull, 

• corners - the number of times bonds form cor- 
ners in the hull, 

• ones - the number of unset bonds by the hull 
that have exactly one set bond connected to 
them, 

• twos - the number of unset bonds by the hull 
that have exactly two set bonds connected to 
them, 

• threes - the number of unset bonds by the hull 
that have exactly three set bonds connected to 
them. 

It is possible to visualize how the scaling of these 
properties converges towards the dn — 7/4. From 



C-xf, C {i+1)j ~C-x%^ (14) 



where C is some constant. Dividing these two equa- 
tions yields us 



fiii=>d,~ln^^/ln^±i. (15) 



In simulations one often takes Xj+i = 2a;,;, and plac- 
ing the intermediate exponent at yjXi+iXi, we get 



dj (y/XiXi+i) = ln : 



C 



(x l+ i = 2xi) 



(16) 

The convergence of the nine studied properties 
towards the value dn = 1-75 can be seen in fig. El 
The finite-size effects are well pronounced for small 
system sizes. This data is practically unusable for 
the simple model ^ — there is no linear range 
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Figure 3: Convergence of the scaling exponents of 
the hull properties towards dn = 1.75. 



for the data values and any attempt will fail at the 
chi-square test. 

Some of the properties converge faster than oth- 
ers. Our method is designed to work even with the 
very slowly converging properties. Hence, to show 
its efficacy, out of the nine studied, we have se- 
lected the five worst converging properties for what 
follows (sides, threes, bonds, twos, ones). 

We run a Monte-Carlo simulation to gather data 
(the values Cij and where i — 1 . . . n and 

j,k = l...m: n > m + 2). This is done by 
tracing instances of hulls within the confines of a 
system-sized box (fig. 2]). The system sizes used 
were 8, 16, . . . , 256. At each system size 4.2 x 10 6 
different hulls were generated and their properties 
counted. 

Once we have the data, we try out different vari- 
ations of m physical quantities and find an instance 
of 5* (d) that matches our requirements (has m clear 
minima that all satisfy the chi-square test with 
71 — in — 1 degrees of freedom). One such com- 
bination (with m — 4) can be seen in fig. [SJ The 
rightmost peak is at the exponent a we are looking 
for and we can determine its statistical uncertainty 
using relation (TTSl) . 

The number of exponents extractable is un- 
known, so different values of m must be tested. The 
chi-square test at the peaks may fail if the statis- 
tical uncertainty in Cij is comparable to Aj (xi) 
within ©. In such a case we must discard sim- 
ulated data from the smallest system and possi- 
bly run Monte-Carlo simulations for an additional 
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Figure 4: Monte-Carlo simulation system instance 
for scale length X{. We start from the center 
(marked by a dot) of box (for simplified 

bond coordinates we use 45 degrees rotated lattice) 
and trace the hull until it reaches an edge. Bond 
values are calculated dynamically on the way (from 
a simple boolean random generator for the uncorre- 
cted percolation). We reject hulls that make a loop 
and so don't reach an edge. Various hull properties 
are counted (for Cij ) and their cross-multiplications 
are calculated (for £jj&). This is repeated for mil- 
lions of times for a single system size and the re- 
sulting data is aggregated. Finally, Cij and S.^fe 
are calculated. 
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Figure 5: A sample uncorrelated percolation hull 
exponent fitting function S (d) using four different 
properties (to = 4) of the percolation cluster hull 
(twos, segs, sides, ends). The dips in the graph cor- 
respond to the exponents in ([5]). For this particu- 
lar example, they are a\ = 1.7494 ± 0.0019, u-i — 
0.756±0.018, a 3 = -0.04±0.16 and a 4 = -1.73 ± 
0.75. 



larger system. When discarding smaller systems, 
the constitution of the first to members in (O may 
change — some members may only be significant 
for the smaller systems. When that happens we 
may lose one or more of the minima and have to 
decrease to. Parameter to also determines the num- 
ber of degrees of freedom for the overall system (as 
we take n = to + 2), hence while increasing m will 
decrease the contribution of the leftover finite-size 
correction terms to the error (systematic error), it 
may at the same time slightly increase the purely 
statistical uncertainty of the results. 

We can now compare the results from using the 
simple model (eq. (|3])) against the one one with 
to different properties (eq. (J5])). Results can be 
seen in table [TJ The method offers correct results 
(within the confines of the statistical uncertainty), 
high precision (small uncertainty) and consistent 
results [each accepted S (d), that is each combina- 
tion of hull properties, yields similar results]. 

To be fair the gathered data is actually unusable 
for the simple model. This is due to the finite-size 
correction terms. To make use of the simple model 
^ we would have to gather data at much larger 
system sizes. To reach similar results (low statisti- 
cal error) to the novel method would demand vastly 
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Name 


Smallest Adn 


Largest Ad/r 


1 


LSQ 3 


1.7299 ± 0.0066 


1.653 ±0.031 


2 


LSQ 4 


1.720 ±0.011 


1.619 ±0.044 


3 


MLSQ 2 


1.7491 ±0.0011 


1.7488 ± 0.0011 


4 


MLSQ 3 


1.7492 ± 0.0017 


1.7492 ± 0.0017 


5 


MLSQ 4 


1.7494 ± 0.0019 


1.7492 ± 0.0018 



Table 1: Results comparing fitting to the simple 
model versus the novel method (Ad// is the 
difference between the calculated and the known 
value). Only first 6 data points at 8, 16, . . . , 256 
are used. LSQ N - regular least squares fitting 
against model ([3]) with one hull property and N 
system sizes. MLSQ M - method described in this 
paper, with M different hull properties and M + 2 
system sizes (as M increases so does the system's 
degrees of freedom, hence the uncertainty grows). 
Uncertainties are given with 0.95 confidence. Note 
that none of the LSQ results passed the chi-squarc 
test. The novel method offers consistent and accu- 
rate results. 



greater computational costs. 

Aside from the scaling exponent of the hull, we 
have also tested the method to calculate the ex- 
ponents of the unscreened perimeter djj — 4/3 
and the cluster dc = 91/48 and obtained simi- 
lar results to what has been demonstrated above; 
the novel algorithm performed flawlessly for all the 
cases. Finally, we have also studied the case of 
correlated percolation, when the scaling exponents 
depend on the roughness (Hurst) exponent H, so 
that dn — da (H). It is analytically known that 
djj (0) = 1.5 [T3]; we have used our method to re- 
cover this result with a high degree of precision [13 1. 

In earlier studies [7J Q~5], the correction 
term exponents have been conjectured theoreti- 
cally. When compared to these studies, our results 
confirm the presence of the simple correction terms 
(resulting from how we determine the diameter of 
a cluster and also from constant offsets to the mea- 
surements of hull properties) . The inherent correc- 
tion exponents described in those papers attributed 
to percolation cluster scaling have not been found 
here. The most likely explanation is that they were 
statistically insignificant. 



6 Conclusion 

A novel and universal method of determining the 
scaling exponents via finite-size Monte-Carlo simu- 
lations has been devisecj^. The method can be ap- 
plied, if it is possible to find m > 2 distinct quanti- 
ties with equal asymptotic scaling exponents. The 
basic idea is to exploit the equality of the exponents 
of finite-size correction terms within the different 
physical quantities. 

As an example, we have used the method to find 
the scaling exponents of the uncorrelated percola- 
tion cluster hulls. Here the method offered consis- 
tent results and increases the accuracy of the scal- 
ing exponent estimates. The method has also been 
used previously in various contexts in the field with 
good results, c.f. [SI IT3]. 

The method is particularly useful when the con- 
vergence to the asymptotic scaling law is slow as 
it vastly reduces computational costs compared to 
traditional methods. We can make use of small sys- 
tem sizes that with traditional methods yield erro- 
neous results or fail altogether. Also, the method 
is extremely useful, if it is necessary to find the 
exponents of the finite-size correction terms. 
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